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Abstract — We present Kinodynamic RRT*, an incremental 
sampling-based approach for asymptotically optimal motion 
planning for robots with linear differential constraints. Our 
approach extends RRT*, which was introduced for holonomic 
robots 1 8 1, by using a fixed-final-state-free-final- time controller 
that exactly and optimally connects any pair of states, where the 
cost function is expressed as a trade-off between the duration 
of a trajectory and the expended control effort. Our approach 
generalizes earlier work on extending RRT* to kinodynamic sys- 
tems, as it guarantees asymptotic optimality for any system with 
controllable linear dynamics, in state spaces of any dimension. 
Our approach can be applied to non-linear dynamics as well 
by using their first-order Taylor approximations. In addition, 
we show that for the rich subclass of systems with a nilpotent 
dynamics matrix, closed-form solutions for optimal trajectories 
can be derived, which keeps the computational overhead of 
our algorithm compared to traditional RRT* at a minimum. 
We demonstrate the potential of our approach by computing 
asymptotically optimal trajectories in three challenging motion 
planning scenarios: (i) a planar robot with a 4-D state space and 
double integrator dynamics, (ii) an aerial vehicle with a 10-D 
state space and linearized quadrotor dynamics, and (iii) a car- 
like robot with a 5-D state space and non-linear dynamics. 

I. Introduction 

Much progress has been made in the area of motion 
planning in robotics over the past decades, where the basic 
problem is defined as finding a trajectory for a robot between 
a start state and a goal state without collisions with obstacles 
in the environment. The introduction of incremental sampling- 
based planners, such as probabilistic roadmaps (PRM) | 9 1 and 
rapidly-exploring random trees (RRT) |11| enabled solving 
motion planning problems in high-dimensional state spaces 
in reasonable computation time, even though the problem is 
known to be PSPACE-hard |10|. PRM and RRT are asymp- 
totically complete, which means that a solution will be found 
(if one exists) with a probability approaching 1 if one lets 
the algorithm run long enough. More recently, an extension of 
RRT called RRT* 1 8 1 was developed that achieves asymptotic 
optimality, which means that an optimal solution will be found 
with a probability approaching 1. 

While RRT* has successfully been applied in practice |[T6l . 
a key limitation of RRT* is that it is applicable only to 
systems with simple dynamics, as it relies on the ability to 
connect any pair of states with an optimal trajectory (e.g. 
holonomic robots, for which straight lines through the state 
space represent feasible motions). For kinodynamic systems, 
however, straight-line connections between pairs of states are 
typically not valid trajectories due to the system's differential 
constraints. Finding a feasible trajectory between two states 




Fig. 1. An asymptotically optimal trajectory computed by our algorithm for 
a quadrotor helicopter with linearized dynamics in a 10-D state space. 

for differentially constrained systems is known as the two- 
point boundary value problem 1 12], and is non-trivial to solve 
in general. Numerical approaches, such as the shooting method 
|T, Tl, are computationally intensive and their solutions may 
not satisfy any notion of optimality. Prior works on extending 
RRT* for kinodynamic systems have therefore focused on 
simple specific instances of kinodynamic systems |7 |, or have 
serious limitations as they do not in fact succeed to compute 
an optimal trajectory between any pair of states (Sj [T3 (see 
our discussion in Section [II]). 

In this paper, we present Kinodynamic RRT*, an extension 
of RRT* that overcomes the above limitations by introducing 
into the algorithm a fixed-final- state-free-final-time controller 
1 13] that exactly and optimally connects any pair of states for 
any system with controllable linear dynamics in state spaces 
of arbitrary dimension. Our approach finds asymptotically 
optimal trajectories in environments with obstacles and bounds 
on the state and control input, with respect to a cost function 
that is expressed as a tunable trade-off between the duration 
of the trajectory and the expended control effort. Moreover, 
we show that for the rich subclass of systems with a nilpotent 
dynamics matrix, expressions for optimal connections between 
pairs of states can be derived in closed-form, and can hence 
be computed quickly. This means that our algorithm computes 
asymptotically optimal trajectories for such kinodynamic sys- 
tems at little additional computational cost compared to RRT* 
for holonomic robots. Also, our approach can handle non- 
linear dynamics by linearizing them about the state that is 



sampled in each iteration of the algorithm. 

We note that while we focus our presentation on extending 
RRT* to kinodynamic systems, also the application of PRM 
191 and path smoothing by iterative shortcutting (H have thus 
far been limited to holonomic systems, for these methods too 
require connecting pairs of states by feasible trajectories. Our 
approach is equally suited for making PRM and smoothing 
applicable to robots with differential constraints, and may 
particularly align well with recent interest in constructing 
roadmaps containing near-optimal trajectories [141. 

We demonstrate the potential of our approach by computing 
asymptotically optimal trajectories in three challenging motion 
planning scenarios: (i) a planar robot with a 4-D state space 
and double integrator dynamics, (ii) an aerial vehicle with a 
10-D state space and linearized quadrotor dynamics (see Fig. 
[T]), and (iii) a car-like robot with a 5-D state space and non- 
linear dynamics. 

The remainder of this paper is organized as follows. We 
begin by discussing related work in Section [ll] and formally 
defining the problem we discuss in this paper in Section [Hi] 
Section |IV] describes how an optimal trajectory is computed 
between any pair of states, and Section |V] describes our 
adapted RRT* algorithm. We describe the extension to non- 



linear dynamics in Section VI discuss experimental results in 
Section |VII| and conclude in Section |VIII| 

II. Related Work 

The term kinodynamic planning was first introduced in 1993 
in f2l, which presented a resolution-complete algorithm for 
optimal planning of robots with discretized double integrator 
dynamics in low-dimensional workspaces. Kinodynamic plan- 
ning has since been an active area of research. Incremental 
sampling-based algorithms, in particular the rapidly-exploring 
random tree (RRT) approach ifTTIl . proved to be effective in 
state spaces of high dimensionality, and is applicable to general 
dynamics systems as it builds a random tree of trajectories, 
and complex dynamics can be forward integrated to expand 
the tree. 

Unfortunately, RRT does not produce optimal trajectories. 
In fact, the probability that it finds an optimal path is zero 
Ip. Recently, RRT* was introduced to overcome this problem 
and guarantees asymptotic optimality |6|; it iteratively builds 
a tree of trajectories through the state space whose probability 
of containing an optimal solution approaches 1 as the number 
of iterations of the algorithm approaches infinity. However, 
RRT* requires for its rewiring step critical to achieving 
asymptotic optimality that any pair of states can be optimally 
connected. Therefore, RRT* was introduced for holonomic 
systems, where any pair of states can be optimally connected 
by a straight-line trajectory through the state space. 

Several attempts have been made to extend RRT*'s ap- 
plicability to kinodynamic systems for which a straight-line 
connection between a pair of states is typically not a valid 
trajectory due to the system's differential constraints. In Q, 
sufficient conditions were established to ensure asymptotic 
optimality of the RRT* algorithm for systems with differential 



constraints, and it was shown how to apply RRT* to two 
specific instances of kinodynamic systems: the Dubin's car and 
the double integrator. The approach of | 3 1 generalizes this to 
arbitrary kinodynamic systems, but has several limitations: to 
connect pairs of states, it uses the shooting method 1 1 1 with a 
constant control input, which is inherently suboptimal. Also, 
the shooting method can only reach a neighborhood of the final 
state in general, which requires costly repropagation of the 
trajectories in the tree descending from such states. Recently, 
LQR-RRT* was proposed in ifTTl . and uses an infinite-horizon 
LQR controller to connect pairs of states. Unfortunately, also 
this controller will not reach the final state exactly or within 
a guaranteed neighborhood, even in case of linear dynamicsj^ 
which inherently leads to suboptimality. 

Our approach improves upon this prior work by connecting 
any pair of states exactly and optimally for systems with 
controllable linear dynamics, which guarantees that asymptotic 
optimality is in fact achieved. We accomplish this by extending 
the well-studied formulation for a fixed final state and fixed 
final time optimal control problem 1 13 | to derive an optimal, 
open-loop, fixed final state free final time control policy. A 
similar approach has been adopted by ifTSl for extending RRTs 
in state space under a dynamic cost-to-go distance metric 0. 
In comparison to the latter work, we present a numerical 
solution that is guaranteed to find a global optimum for the 
general case, and an efficient closed-form solution for the 
special case of systems with a nilpotent dynamics matrix. 

III. Problem Definition 

Let X = R'' SindU = R'^ be the state space and control 
input space, respectively, of the robot, and let the dynamics 
of the robot be defined by the following linear system, which 
we require to be formally controllable: 



±[t] = Ax[t] + Bu[t] + c, 



(1) 



where x[t] G A' is the state of the robot, u[t] eU is the control 
input of the robot, and A e M^><^, B e R^><^, and c G 
are constant and given. 

A trajectory of the robot is defined by a tuple tt = 
(x[],u[],r), where r is the arrival time or duration of the 
trajectory, u : [0, r] U defines the control input along the 
trajectory, and x : [0,r] ^ A' are the corresponding states 
along the trajectory given x[0] with ±[t] = Ax[t] -\- Bu[t]-\-c. 

The cost c[7t] of a trajectory tt is defined by the function: 



{1 ^ u[t]^ Ru[t]) dt, 



(2) 



which penalizes both the duration of the trajectory and the 
expended control effort, where R G W^^^ is positive-definite, 
constant, and given, and weights the cost of the control inputs 
relative to each other and to the duration of the trajectory. 

^Imagine a system with double integrator dynamics where the state is 
defined by the robot's position and velocity: an infinite-horizon LQR controller 
can either reach a state with a specified velocity, or a state with a specified 
position, but cannot satisfy both as time approaches infinity. 



Let Afree C X define the free state space of the robot, 
which consists of those states that are within user-defined 
bounds and are colHsion-free with respect to obstacles in the 
environment. Similarly, let ^/free C U define the free control 
input space of the robot, consisting of control inputs that are 
within bounds placed on them. This brings us to the formal 
definition of the problem we discuss in this paper: given a 
start state Xgtart ^ ^free and a goal state Xgoai ^ ^free, find 
a collision-free trajectory TTf^^^ between Xgtart and Xgoai with 
minimal cost: 

TTfree = argmin{7r | x[0] = Xgtart A X[r] = Xgoal A 

V{t G [0, r]} (X[t] G A'free A u[t] G ZYfree)} c^. (3) 

We note that the cost function of Eq. (|2]) obeys the optimal 
substructure property, let 7r*[xo,xi] = (x[],u[],r) be the 
optimal trajectory between xq G A' and xi G A', irrespective 
of bounds and obstacles, and let c*[xo,xi] be its cost: 

c*[xo,xi] = min{7r|x[0] = xq A x[r] = xi} c\i\\. (4) 
7r*[xo,xi] = argmin{7r | x[0] = xq A x[r] = Xi} c[7r], (5) 

then for all < t < r we have: c*[xo,xi] = c*[xo,x[t]] + 
c*[x[t],xi]. Hence, an optimal collision-free trajectory TTf^^^ 
between start and goal consists of a concatenation of 
optimal trajectories between a series of successive states 

(^start ■) Xi, X2, . . . , Xgoal) in Afree- 

IV. Optimally Connecting a Pair of States 

A critical component of our approach to solve the problem 
as defined in Eq. ([3]) is to be able to compute the optimal 
trajectory tt* [xq, xi] (and its cost c* [xq, xi]) between any two 
states xo G A' and xi G X, as defined in Eqs. ([5]) and (|4]). In 
this section we discuss how to compute these. It is known from 
(131 what the optimal control policy is in case a fixed arrival 



time r is given, as we review in Section IV-A We extend this 



analysis to find the optimal free arrival time in Section |IV-B 
and show how to compute the corresponding optimal trajectory 



in IV-C We discuss practical implementation in Section IV-D 



A. Optimal Control for Fixed Final State and Fixed Final Time 

Given a fixed arrival time r and two states xq and xi, we 
want to find a trajectory (x[], u[],r) such that x[0] = xq, 
x[r] = xi, and x[t] = Ax[t] + 5u[t] + c (for all < t < r), 
minimizing the cost function of Eq. ([2]). This is the so-called 
fixed final state, fixed final time optimal control problem. 

Let G\f\ be the weighted controllability Gramian given by: 

G[t] = [ exp[A{t-t')]BR-^B^ exp[A^{t-t')]dt\ (6) 
Jo 

which is the solution to the Lyapunov equation: 

G[t] = AG[t] + G[t]A^ + BR-^B^, G[0] = 0. (7) 

We note that G[t\ is a positive-definite matrix for t > if the 
dynamics system of Eq. ([T]) is controllable. 



Further, let x[t] describe what the state x (starting in xq at 
time 0) would be at time t if no control input were applied: 

x[t] = exp[At]xo + / exp[A(t - t')]c dt' , (8) 

which is the solution to the differential equation: 

= Ax[t] + c, x[0]=xo. (9) 

Then, the optimal control policy for the fixed final state, 
fixed final time optimal control problem is given by: 

M[t] =i?-^5^exp[A^(r-t)]G[r]-^(xi -x[r]), (10) 

which is an open-loop control policy. We refer the reader to 
1 13 1 for details on the derivation of this equation. 

B. Finding the Optimal Arrival Time 

To find an optimal trajectory 7r*[xo, xi] between xq and xi 
as defined by Eq. ([5]), we extend the above analysis to solve 
the fixed final state, free final time optimal control problem, 
in which we can choose the arrival time r freely to minimize 
the cost function of Eq. ([2]). 

To find the optimal arrival time r*, we proceed as follows. 
By filling in the control policy of Eq. ([TO]) into the cost 
function of Eq. ^ and evaluating the integral, we find a 
closed-form expression for the cost of the optimal trajectory 
between xq and xi for a given (fixed) arrival time r: 

c[r] = r + (xi - x[r])^G[r]-i(xi - x[r]). (11) 

The optimal arrival time r* is the value of r for which this 
function is minimal: 



r* = argmin{r > 0} c[r], 



(12) 



and the cost of the optimal trajectory between xq and xi as 
defined in Eq. ^ is given by c*[xo,xi] = c[r*]. 

The optimal arrival time r* is found by taking the derivative 
of c[t] with respect to r (which we denote c[r]), and solving 
c[t] =0 for r. The derivative is given by: 

c[t] = 1 - 2(Axi + cfd[r] - d[rfBR-^B^d[r], (13) 

where we define: 

d[r]=G[r]-i(xi-x[r]). (14) 

It should be noted that the function c[r] may have multiple 
local minima. Also, note that c[r] > r for all r > 0, since 
G[t] is positive-definite (see Fig. [2]). 

C. Computing the Optimal Trajectory 

Given the optimal arrival time r* as defined above, 
we find the corresponding optimal trajectory 7r*[xo,xi] = 
(x[], u[], r*), as defined in Eq. ([5]), as follows. Let us define: 

y[t]=exp[A^iT* -t)]d[T*], (15) 

such that the optimal control policy (see Eq. ( fTol i) is given by: 



u[t] = R-^B'y[t] 



(16) 




Fig. 2. Plot of the function c[r] for a robot moving on the 1-D line with 
double integrator dynamics (A = ((0, 1), (0, 0)), B = (0,1)^, c = 0, 
R = 1) between xq = (0, 0)^ and xi = (1, 1)^. The optimal arrival time 
is T* = V7 — 1 ~ 1.65, which is where the function c[r] is minimal. 



Filling in this optimal control policy into Eq. ([T]) gives us the 
differential equation for the state x[]: 

±[t] = Ax[t] + BR-^B^y[t] + c, x[r*] = xi. (17) 

Noting that Eq. ( p3] ) is the solution to the differential equation: 

y[t] = -A^y[t], y[T*]=d[T*], (18) 

and combining this with Eq. ^Tl) gives us the composite 
differential equation: 



A BR-^B^ 
-A'^ 
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(19) 



which has as solution: 
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dt'. (20) 



This gives us x[t] and, using Eq. ([16]), \i[t] for all < t < r*, 
which completely determines 7r*[xo,xi]. 

D. Practical Implementation 

For the implementation of the above computations in prac- 
tice, we distinguish the special case in which matrix A is 
nilpotent, in which case we can derive a closed-form solution 
for the optimal trajectory, from the general case, in which case 
we can find the optimal trajectory numerically. 

If matrix A G R^^^ is nilpotent, i.e. A'^ = 0, which is 



not uncommon as we will see in Section VII exp [At] has a 
closed-form expression in the form of an (n — 1) -degree matrix 
polynomial in As a result, the integrals of Eqs. ^ and ^ 
can be evaluated exactly to obtain closed-form expressions 
for G[r] and x[r]. Solving c[r] =0 for r to find the optimal 
arrival time r* then amounts to finding the roots of a (high- 
degree) polynomial in r. Various methods exist to find all roots 
of a polynomial Q, which gives us the global minimum of c[r] 
and the corresponding optimal arrival time r*. Subsequently, 
the nilpotence of A implies that the matrix [ ^ ^^j^t ] is 
nilpotent as well, which means that Eq. ( [2Q| ) can be evaluated 
exactly to obtain a closed form expression for the optimal 



trajectory (states and control inputs) between any two states 
Xq and Xl. 

For a general (not nilpotent) matrix A, we integrate G[t] 
and x[t] forward in time according to Eqs. ^ and ^ using 
the 4th-order Runge-Kutta method |1|, which gives us G[r], 
x[r], and c[r] for increasing r > 0. We keep track of the 
minimal cost c* = c[r] we have seen so far and the corre- 
sponding arrival time, as we perform the forward integration 
for increasing r > 0. Since c[r] > r for all r > 0, it suffices 
to terminate the forward integration at r = c* to guarantee that 
a global minimum c* of c[r], and the corresponding optimal 
arrival time r*, has been found. This procedure also gives us 
d[r*], which we use to subsequently reconstruct the optimal 
trajectory between xq and xi, by integrating the differential 
equation ([19]) backward in time for r* > t > using 4th-order 
Runge-Kutta. 

V. KiNODYNAMIC RRT* 

To find the optimal collision-free trajectory n^^^^ as defined 
in Eq. ([3]), given the ability to find an optimal trajectory 
between any pair of states as described above, we use an 
adapted version of RRT*, since RRT* is known to achieve 
asymptotic optimality\ that is, as the number of iterations of the 
algorithm approaches infinity, the probability that an optimal 
path has been found approaches 1. The algorithm is given in 
Fig. [3] 

The algorithm builds a tree T of trajectories in the free 
state space rooted in the start state. In each iteration i of the 
algorithm, a state x^ is sampled from the free state space Xhee 
(line [3]) to become a new node of the tree (line[TO]). For each 
new node a parent is found among neighboring nodes already 
in the tree, i.e. the nodes x for which c* [x, x^] < r for some 
neighbor radius r. The node x that is chosen as parent is 
the node for which the optimal trajectory 7r[x, x^] to the new 
node is collision-free (i.e. the states and control inputs along 
the trajectory are in the respective free spaces) and results in 
a minimal cost between the root node (Xgtart) and the new 
node (lines [4j[6l ). Subsequently, it is attempted to decrease the 
cost from the start to other nodes in the tree by connecting the 
new node to neighboring nodes in the tree, i.e. the nodes x for 
which c* [x^, x] < r. For each state x for which the connection 
is collision-free and results in a lower cost to reach x from 
the start, the new node x^ is made the parent of x (lines |7][9]). 
Then, the algorithm continues with a new iteration. If this is 
repeated indefinitely, an optimal path between Xgtart and Xgoai 
will emerge in the tree. 

The algorithm as given in Fig. [3] differs subtly from the 
standard RRT* algorithm. First of all, we have defined our 
problem as finding a trajectory that exactly arrives at a goal 
state, rather than a goal region as is common in RRT*. As 
a consequence, we explicitly add the goal state to the set of 
states that is considered for a forward connection from a newly 
sampled node in line |7j even if the goal is not (yet) part of 
tree. Also, typical RRT* implementations include a "steer" 
module, which lets the tree grow towards a sampled state (but 
not necessarily all the way), and adds the endpoint of a partial 



KlNODYNAMICRRT*[Xstart ^ ^free,Xgoal ^ A'free] 
1: {Xstart}. 

2: for i G [1, oo) do 

3: Randomly sample G A'free. 

4: X ^ argmin{x G T | c* [x, x^] < r A 

COLLISIONFREE [tT* [x, Xi] ] } (cost [x] + C* [x, Xi] ) . 

5: parent [x^] ^ x. 

6: cost[xi] ^ cost[x] + c*[x, Xi]. 

7: for all {x G T U {xgoai} I c*[xi,x] < r A cost[x^] + 

c*[xi,x] < cost[x] A COLLlSlONFREE[7r*[xi,x]]} do 
8: cost[x] ^ cost[x^] + c*[xi,x]. 
9: parent [x] ^ x^. 
10: r ^ r U {xj. 

Fig. 3. The adapted RRT* algorithm. The tree T is represented as a set 
of states. Each state x in the tree has two attributes: a pointer parent [x] 
to its parent state in the tree, and a number cost[x] which stores the cost 
of the trajectory in the tree between the start state and x. Further, we define 

COLLISIONFREE[x[], u[],r] = V{t G [0,r]} (x[t] G AffreeAuft] G Uiree)- 

trajectory as node to the tree |8|. Since it is non-trivial given 
our formulation to compute a partial trajectory of a specified 
maximum cost, our algorithm attempts a full connection to the 
sampled state, and adds the sampled state itself as a node to 
the tree. These changes do not affect the asymptotic optimality 
guarantee of the algorithm. 

In the original RRT* algorithm, the neighbor radius r can 
be decreased over the course of the algorithm as a function 
r = {{j/Cd) \og[i]/iy^^ of the number of nodes i currently in 
the tree, without affecting the asymptotic optimality guarantee, 
where d is the dimension of the state space, Q is the volume 
of a d-dimensional unit ball, 7 > 2^(1 + l/d)fi[Xfree], and 
/i[Afree] is the volumc of the state space |6|. For non-Euclidean 
distance measures, such as our function c* [xq, xi], let 7^[x, r] 
be the set of states that can reach x or are reachable from x 
with cost less than r: 

7^[x,r] = {x' G A'|c*[x,x'] < r Vc*[x^x] < r}. (21) 

Then the neighbor radius r must be set such that a ball of 
volume 7log[z]/z is contained within 7^[x, r] |t7|. A finite 
radius r always exists in our case such that this holds, since we 
require that the system's dynamics are formally controllable. 

VI. Systems with Non-Linear Dynamics 

We have presented our algorithm for linear dynamics sys- 
tems of the type of Eq. ([T]), but we can apply our algorithm 
to non-linear dynamics as well through linearization. Let the 
non-linear dynamics of the robot be defined by a function f : 

xM =f[xM,uM]. (22) 

We can locally approximate the dynamics by linearizing the 
function f to obtain a system of the form of Eq. ([T]), with: 

A= ^[x,u], B = |^[x,u], c = f[x,u] -A^-Bu, 

(23) 

where x is the state and u is the control input about which 
the dynamics are linearized. The resulting linear system is a 



first-order Taylor approximation of the non-linear dynamics, 
and is approximately valid only in the vicinity of x and u. 

We adapt the algorithm of Fig. [3] to non-linear dynamics 
by (re)linearizing f in each iteration of the algorithm about 
X = x^ and u = after a new state x^ is sampled in line |3] 
The resulting linear dynamics are then used in the subsequent 
computations of the functions c* and tt* (note that this only 
works if the linearized dynamics are controllable). We choose 
X = x^ since it is either the start or the end point of any 
trajectory computed in that iteration of the algorithm, and we 
choose u = since the cost function (see Eq. ([2])) explicitly 
penalizes deviations of the control input from zero. 

The linearization is only a valid approximation if the 
computed trajectories do not venture too much away from 
the linearization point. As over the course of the algorithm 
the distances between states get shorter (due to a decreasing 
neighbor radius r) and the trajectories in the tree get more 
optimal (hence having control inputs closer to zero), this 
approximation becomes increasingly more reasonable. It can 
be helpful, though, to let the neighbor radius r not exceed 
a certain maximum within which the linearizations can be 
assumed valid. 

VII. Experimental Results 

We experimented with our implementation on three kino- 
dynamic systems; a double integrator disk robot operating in 
the plane, a quadrotor robot operating in three space, and a 
non-holonomic car-like robot operating in the plane, which 
are discussed in detail in Sections |VII A| |VII-B| and |VII-C| 
respectively. Simulation results are subsequently analyzed in 

Section ivnini 

A. Linear Double Integrator Model 

The double integrator robot is a circular robot capable of 
moving in any direction by controlling its acceleration. Its 
state space is four-dimensional, and its linear dynamics are 
described by: 
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R = rl, (24) 



u = a, and c = 0, where p describes its position in the plane, 
V its velocity, and a its acceleration. Further, we set bounds 
such that p G [0,200] x [0,100] (m), v G [-10, 10]^ (m/s), 
and u = aG [—10, 10]^ (m/s^). The control penalty r was set 
to 0.25 as this permitted the robot to reach velocities near its 
bounds but not frequently exceed them. 

We experimented with this model in the environment of 
Fig. [4] Clearly, A is nilpotent as = 0, so we can use 
both the closed-form and the numerical method for computing 
connections between states. 

B. Linearized Quadrotor Model 

The quadrotor helicopter was modeled after the Ascending 
Technologies' ResearchPilot. Its state x = (p^, v^, r^, w^)^ 
is 12-dimensional, consisting of three-dimensional position p, 
velocity v, orientation r (rotation about axis r by angle ||r||). 



Fig. 4. Asymptotically optimal trajectory for double integrator robot after 
100,000 nodes were added to the tree. 



and angular velocity w. Its dynamics are non-linear (15], but 
are well-linearizable about the hover point of the quadrotor. 
The linearization is (very) sensitive though to deviations 
in the yaw. Fortunately, the yaw is a redundant degree of 
freedom, so in our linearization, we constrain the yaw (and 
its derivative) to zero. This gives a reduced ten-dimensional 
state and three-dimensional control input, with the following 
linearized dynamics: 
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, (26) 



where r and w are two-dimensional (with their third compo- 
nent implicitly zero), g = 9.8m/s^ is the gravity, m is the mass 
of the quadrotor (kg), £ the distance between the center of the 
vehicle and each of the rotors (m), and j is the moment of 
inertia of the vehicle about the axes coplanar with the rotors 
(kg m^). The control input u consists of three components: 
Uf is the total thrust of the rotors relative to the thrust needed 
for hovering, and Ux and Uy describe the relative thrust of 
the rotors producing roll and pitch, respectively. Further, the 
bounds are defined as p G [0,5]^ (m), v G [—5,5]^ (m/s), 
r G [-1, 1]2 (rad), w G [-5,5]^ (rad/s), Uf G [-4.545,9.935] 
(N), and u^^Uy G [—3.62, 3.62] (N). The matrix R was chosen 
such that producing force is penalized equally for each rotor. 

The quadrotor simulations were performed in the envi- 
ronment of Fig. [5] for the linearized dynamics. Clearly, A 
is nilpotent as it is strictly upper diagonal, so we can use 
both the closed-form and the numerical method for computing 
connections between states. 

C. Non-Linear Car-Like Model 

The car-like robot has a five-dimensional state x = 
(x, 6>, V, K,)^, consisting of its planar position (x, y) (m), its 
orientation (rad), speed v (m/s), and curvature k, (m~^). The 



Fig. 5. Asymptotically optimal trajectory for quadrotor robot after 60,000 
nodes were added to the tree. 

control input u = {uy^u^)^ is two-dimensional and consists 
of the derivatives of speed and curvature, respectively. The 
dynamics are non-linear, and described by x = f[x, u], with 
f given by: 
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For this system, we repeatedly linearize the dynamics about 
the last sampled state, as described in Section |Vl] Let this 
state be X = (f , ^, 6>, i), ^)^, then the linearized dynamics are 
given by: 



, (28) 



and c = f [x, 0] — Ax, where bounds were set such that 
G [0,200], py G [0,100], G [-7r,7r], v G (0,10], 
K, G [—0.25, 0.25]. We note that the velocity must be non-zero, 
otherwise the resulting linear dynamics are not controllable. 

The car-like robot experiments were performed in the en- 
vironment of Fig. [6j the same environment as the double 
integrator. Also in this case, the dynamics matrix A is nilpotent 
for all linearizations, so we can use both the closed-form 
and the numerical method for computing connections between 
states. 

D. Analysis of Results 

We used our algorithm to compute asymptotically optimal 
trajectories for the double integrator, the quadrotor, and the 
car-like robots. They are shown in Figs. |4] [5] and [6] While 
the paths found for the double integrator and quadrotor robots 
appear continuous and smooth, in the case of the car-like robot 




Fig. 6. Asymptotically optimal trajectory for the car-like robot after 60,000 
nodes were added to the tree. 



TABLE I 

Timing data for the first 5000 nodes of each simulation (s). 







Closed Form 




Runge Kutta 4 




nodes 
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QuadRtr 


Car 


Dblint 


QuadRtr 


Car 


1000 


19.75 


116.2 


5.416 


969.8 


4644 


274.8 


2000 


43.26 


274.9 


12.43 


3638 


10819 


331.2 


3000 


72.59 


507.3 


21.74 


7872 


20284 


405.6 


4000 


108.6 


841.6 


31.99 


13479 


33401 


497.4 


5000 


150.4 


1168 


42.94 


20772 


68703 


606.3 



effects of linearization are clearly visible; the robot appears to 
skid sideways to some extent, as if drifting through the curves. 

Table [l| shows the time required to expand the first 5,000 
nodes for all three systems using both the closed form method 
and the numerical 4th-order Runge-Kutta (RK4) method for 
computing connections between states. It is clear that the 
closed-form method executed much more quickly than the 
RK4 method in all cases. On average, we see a factor of 45 (!) 
difference in running time. Using either of the two methods 
resulted in solutions with comparable costs after expanding 
the same number of nodes; variations that occurred were a 
result of numerical errors. 

We also see that nodes were less quickly processed for the 
double integrator than for the car-like robot, despite a lower 
dimensionality. This is because for the double integrator and 
quadrotor experiments, we used a neighbor radius of r = oo, 
while in the case of the non-holonomic car-like robot only 
connections to states within a tight radius (approximately 
corresponding to connections within one width of the road) 
were accepted. This radius was imposed on the car-like system 
to ensure short connections as the linearization breaks down 
over large distances, but it also demonstrates the positive effect 
of using a reduced radius on performance. Processing nodes 
for the quadrotor experiment appeared most computationally 
intensive. This is a result of the high-dimension of its state 
space. The numbers of Table |l| also highlight the quadratic 
nature of the algorithm: the total accumulated running time is 
a quadratic function of the number of nodes that have been 
added to the tree. 

Figure [T] shows the cost of the current-best solution for each 
experiment as more nodes are added to the tree. A total of 
100,000 nodes were expanded in the double integrator robot 
simulation, 60,000 nodes in the quadrotor robot simulation, 
and 200,000 nodes in the non-holonomic robot simulation. In 
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Fig. 7. Graphs showing the cost of the current-best solution as a function of 
the number of nodes in the tree for (from left to right) the double integrator, 
the quadrotor, and the car- like robot experiments. 

all cases we see that a high cost solution is found in relatively 
few nodes, and that these solutions are quickly refined as 
a result of the RRT* rewiring procedure. Necessarily these 
refinements plain off as the solutions approach the asymptotic 
optimum. 

VIII. Discussion, Conclusion, and Future Work 

We have presented Kinodynamic RRT*, an incremental 
sampling-based approach that extends RRT* for asymptot- 
ically optimal motion planning for robots with differential 
constraints. Our approach achieves asymptotically optimality 
by using a fixed-final- state-free-final-time optimal control for- 
mulation that connects any pair of states exactly and optimally 
for systems with controllable linear dynamics. We have shown 
that tor the rich subclass of systems with a nilpotent dynamics 
matrix, such trajectories can be computed efficiently, making 
asymptotically optimal planning computationally feasible for 
kinodynamic systems, even in high-dimensional state spaces. 
We plan to make the source code of our implementation 
publicly available for download. 

For our experiments, we have not fully optimized our 
implementation, and we believe that running times can be 
further improved. In particular, extensions suggested in earlier 
work 1 8 1, such as using an admissible heuristic that can be 
quickly computed and provides a conservative estimate of 
the true cost of moving between two states may prune many 
(relatively costly) attempts to connect pairs of states. Such 
a heuristic can then also be used in a branch-and-bound 
technique 1 8 1 to prune parts of tree of which one knows it will 
never contribute to an optimal solution. Further, the constant 
involved in the rate by which the neighbor radius is allowed 
to decrease is difficult to estimate for kinodynamic systems, 
which prompted us to use very conservative radii. Further 
analysis of the reachable set is needed to establish reasonable 
estimations. This would potentially also aid in developing a 
form of efficient neighbor searching for non-Euclidean state 
spaces (we currently use a brute-force approach), which is still 
largely an unexplored area. 

Other areas of potential improvement include studying non- 
uniform sampling to accelerate the convergence to optimal 



solutions. One could sample more heavily around the current 
optimal solution, or use stochastic techniques to infer distri- 
butions of samples that are likely to contribute to an optimal 
trajectory. For a quadrotor helicopter for instance, one can 
imagine that there is a strong correlation between its velocity 
and orientation, which should be reflected in the sampling. 
In addition, we note that, as mentioned in the introduction, 
the ability to connect any pair of states can be used to 
perform trajectory smoothing by iterative shortcutting as post- 
processing step. This may improve the quality of solutions 
further and provide better estimates of the convergence rate of 
the algorithm. 

Lastly, we plan to apply our planner to real-world robots, 
in particular quadrotors. This would require constructing a 
stabilizing controller around the computed trajectory, either 
using traditional techniques such as LQR, or by repeatedly 
computing reconnections between the current state of the robot 
and a state on the trajectory. 
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